Analysis of market access in Somalia. We will look at food market locations in Somalia and assess their accessibilty by regular sampled points.
This section for reproducibility. We use R in the version 4.2.2. In
order to be able to reproduce this analysis you need an openrouteservice
API key. Get one here. Put it in a text
file file called config.txt in the working directory. It
has no structure, just the key in the first line.
ZXCVBNMLKJHGFDSAQWERTYUIOP
Almost all of the libraries can be installed via CRAN with the
command install.packages('packagename'). The
openrouteservice package is not on CRAN and needs to be installed via
github. With the command
remotes::install_github("GIScience/openrouteservice-r");
Also tmap in its most recent version 4 is not yet available on CRAN so
we need to download and install it via github too.
# main libraries
library(tidyverse)
library(glue)
library(sf)
library(units)
library(ggplot2)
library(tictoc)
library(openrouteservice) # remotes::install_github("GIScience/openrouteservice-r");
library(jsonlite)
library(terra)
library(exactextractr)
library(tmap) # remotes::install_github("r-tmap/tmap")
library(leaflet)
library(furrr)
library(purrr)
library(kableExtra)
library(mapsf)
library(RColorBrewer)
library(geonode4R)
library(scales)
#config <- readLines("config.txt")
#api_key <- config[1]
#user <- config[2]
#password <- config[3]
api_key <- ""
# Function to adjust a list of coordinates to the location of the closest road segment via the snapping endpoint of ORS. ORS itself only allows for a maximum distance of 350m to snap.
ors_snap <- function(x, rowids, local=F) {
x=coord_list
rowids=grid5km$rowid
local = T
library(httr)
# Define the body
body <- list(
locations = x,
radius = 100000000 # apparently the maximum snapping distance is 5km
)
# Define the headers
headers <- c(
'Accept' = 'application/json, application/geo+json, application/gpx+xml, img/png; charset=utf-8',
'Authorization' = api_key,
'Content-Type' = 'application/json; charset=utf-8'
)
endpoint <- if(local){'http://0.0.0.0:8080/ors/v2/snap/driving-car'} else {'https://api.openrouteservice.org/v2/snap/driving-car'}
# Make the POST request
response <- POST( endpoint,
#'https://api.openrouteservice.org/v2/snap/driving-car',
body = body,
add_headers(headers),
encode = "json")
resp_content <- content(response)
extract_data <- function(x, index) {
if (is.null(x)) {
return(data.frame(
rowid = index,
lon = NA,
lat = NA,
snapped_distance = NA
))
} else {
return(data.frame(
rowid = index,
lon = x$location[[1]],
lat = x$location[[2]],
snapped_distance = x$snapped_distance
))
}
}
# Extract data from the list and create a dataframe
df <- do.call(rbind, lapply(seq_along(resp_content$locations), function(i) extract_data(resp_content$locations[[i]], i)))
df$rowid <- rowids
# Convert the dataframe to an sf object
sf_df <- df |>
drop_na(lon, lat) |> # Drop rows with NA coordinates
st_as_sf(coords = c("lon", "lat"), crs = 4326) |> # Define the coordinate columns and set CRS
st_sf()
return(sf_df)
}
The workflow is as follows:
By snapping we refer to the process of changing the location of an origin or destination of a route to the nearest road segment. This is necessary because the openrouteservice API only allows for a maximum distance of 350m to snap. However in the context of Somalia greater snapping distances are desired as the road density in some regions might be low.
We use the following datasets as inputs:
pop <- rast('data/som_ppp_2020_UNadj_constrained.tif')
markets_som <- read_csv("data/wfp_food_prices_som.csv")
admin <- st_read('data/Som_Admbnda_Adm2_UNDP.shp', quiet = T)
In order to use the market locations we need to create a sf object by converting the latitude and longitude columns to a point geometry column. By checking the distinct locations of the geometry column we got 44 food markets in Somalia.
# Clean the latitude and longitude columns
markets_som_clean <- markets_som |>
filter(!is.na(latitude) & !is.na(longitude)) |>
slice(-1) |>
mutate(latitude = as.numeric(latitude),
longitude = as.numeric(longitude),
date = as.Date(date, format = "%Y-%m-%d"))
#add year
markets_som_clean <- markets_som_clean |>
mutate(year = as.numeric(format(date, "%Y")))
# Convert to sf object
markets_som_sf <- markets_som_clean |>
st_as_sf(coords = c("longitude", "latitude"), crs = 4326)
#unique markets
markets_som_asc_sf <- markets_som_sf |>
distinct(geometry, .keep_all = TRUE)
#unique markets descending
markets_som_desc_sf <- markets_som_sf |>
arrange(desc(row_number())) |> # Sort the dataframe such that the last rows come first
distinct(geometry, .keep_all = TRUE)
#combine both
markets_som_sf <- markets_som_desc_sf |> left_join(markets_som_asc_sf |> st_drop_geometry() |> select(market, year), by ="market")
#prepare final market sf
markets_som_sf <- markets_som_sf |> rename(first_year = year.y, last_year = year.x) |>
select(-date) |>
mutate(rowid = row_number())
Finding the shortest/fastest way between a number of origins and destinations is a job for the Matrix endpoint of ors. Lets start with the grid sampling. We decide for a 5000m width hexagonal grid. A hexagonal grid is closer to a circle and therefore the better joice for our analysis. Within each grid cell we will use the centroid and search for road network locations to snap to.
# create 5km grid
grid5km <-
st_make_grid(admin |> st_transform(20538), cellsize = 5000, square = F, flat_topped = T)
# convert to sf df
grid5km <- grid5km |> st_as_sf()
names(grid5km) <- 'geometry'
st_geometry(grid5km) <- 'geometry'
# get rid of all cells that are not within somalia
grid5km <-
grid5km |> st_transform(4326) |> st_filter(admin |> st_union())
# join admin codes
grid5km <-
grid5km |> st_join(admin |> select(admin2Name, admin2Pcod, admin1Name, admin1Pcod, admin0Name, admin0Pcod),
left = T,
largest = T)
# join pop information
grid5km$pop <- exact_extract(pop, grid5km, 'sum', progress = F)
# add rowid
grid5km <- grid5km |> mutate(rowid = row_number())
#filter for pop > 0
grid5km <- grid5km |> filter(pop > 0)
Time for this code chunk to run: 323.11 seconds
The grid is created, now snap the centroids.
# convert
coord_list <-
lapply(grid5km$geometry |> st_centroid(), function(x)
c(st_coordinates(x)[, 1], st_coordinates(x)[, 2]))
# snap the list of cords
snapped_coords <- ors_snap(coord_list,rowids=grid5km$rowid, local = T)
# join back
grid5km_snapped <- grid5km |>
left_join(snapped_coords |> tibble(), by = "rowid") |>
mutate(centroid = st_centroid(geometry.x))
# refactor names in order to keep, the grid geometry, centorid and snapped centroid
names(grid5km_snapped)[10:11] <- c('geometry', 'snapped_centroid')
st_geometry(grid5km_snapped) <- 'geometry'
grid5km_snapped <- grid5km_snapped |> mutate(snapped = ifelse(is.na(snapped_distance), F, T))
#filter for snapped values
grid5km_snapped_NA <- grid5km_snapped |> filter(snapped == F)
grid5km_snapped_notNA <- grid5km_snapped |> filter(snapped == T)
Time for this code chunk to run: 22.18 seconds
Awesome we now got 17738 snapped and 3045 not snapped grid cells.
In the next chunk we will calculate the travel times and distances from each grid cell to each market. We will use the openrouteservice API for this.
## run 1 / 44: 5.313 sec elapsed
## run 2 / 44: 5.063 sec elapsed
## run 3 / 44: 4.532 sec elapsed
## run 4 / 44: 5.419 sec elapsed
## run 5 / 44: 5.091 sec elapsed
## run 6 / 44: 4.777 sec elapsed
## run 7 / 44: 4.941 sec elapsed
## run 8 / 44: 4.634 sec elapsed
## run 9 / 44: 4.092 sec elapsed
## run 10 / 44: 5.024 sec elapsed
## run 11 / 44: 6.178 sec elapsed
## run 12 / 44: 4.925 sec elapsed
## run 13 / 44: 4.724 sec elapsed
## run 14 / 44: 5.053 sec elapsed
## run 15 / 44: 4.62 sec elapsed
## run 16 / 44: 4.523 sec elapsed
## run 17 / 44: 5.523 sec elapsed
## run 18 / 44: 5.734 sec elapsed
## run 19 / 44: 4.811 sec elapsed
## run 20 / 44: 4.703 sec elapsed
## run 21 / 44: 4.855 sec elapsed
## run 22 / 44: 5.113 sec elapsed
## run 23 / 44: 4.572 sec elapsed
## run 24 / 44: 5.971 sec elapsed
## run 25 / 44: 7.065 sec elapsed
## run 26 / 44: 4.897 sec elapsed
## run 27 / 44: 5.225 sec elapsed
## run 28 / 44: 4.904 sec elapsed
## run 29 / 44: 4.302 sec elapsed
## run 30 / 44: 5.031 sec elapsed
## run 31 / 44: 6.649 sec elapsed
## run 32 / 44: 5.31 sec elapsed
## run 33 / 44: 4.482 sec elapsed
## run 34 / 44: 5.13 sec elapsed
## run 35 / 44: 4.295 sec elapsed
## run 36 / 44: 4.818 sec elapsed
## run 37 / 44: 4.889 sec elapsed
## run 38 / 44: 4.497 sec elapsed
## run 39 / 44: 4.831 sec elapsed
## run 40 / 44: 4.984 sec elapsed
## run 41 / 44: 4.568 sec elapsed
## run 42 / 44: 4.786 sec elapsed
## run 43 / 44: 4.73 sec elapsed
## run 44 / 44: 4.452 sec elapsed
Awesome we now have the distances from all 44 markets to all grids. The resulting dataframe has 780472 rows. Now we want to find the market that is closest from every grid cell in terms of traveltime.
In the following chunks some visualizations are executed to assess the results.
What is the relation between Traveltime and population for each grid cell?
Maps of population distribution, travel time to the closest market and the closest market with respective administrative boundaries and the markets locations.
## tmap mode set to 'view'